Infrastructure
Introduction
This guide describes the errors that you can encounter with the infrastructure Dais runs on, and how to troubleshoot them.
NOTE: While Dais will report this category of error, these are not Dais errors; the root causes of these issues are infrastructure-related.
NOTE: When reporting an error, please remember to include:
- The steps to reproduce the issue.
- Timestamp (with timezone).
- App Id and Region (if known).
- Any other details that could assist with troubleshooting.
External Component Errors
File System Errors
Reported as FileSystemError with HTTP status code 403 or 503.
Troubleshooting Guide
Depending on the CLOUD_BUCKET_STORAGE_PROVIDER used in the deployment, the troubleshooting steps may be different, but the common steps are:
- Ensure the valid service account key is set for the deployment:
  - azure-storage-account-connection-string.key
  - gcp-storage.json
- Ensure the SA keys are mounted/passed to the services.
  - Azure: the 'secretRef' with the Kubernetes secret azure-storage-creds
  - GCP: The SA key should be mounted as volume from the secret cloud-storage-credentials
For 503 errors, use the infrastructure logs to investigate the root cause further.
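The provider-specific checks above can be summarized as a small lookup. This is a minimal sketch, not part of Dais: the helper name storage_checklist and the provider strings "azure" and "gcp" are assumptions made for illustration, while the key file and secret names are the ones listed above.

```python
# Hypothetical helper: maps a storage provider to the credentials to verify.
# The provider strings "azure" and "gcp" are ASSUMED values of
# CLOUD_BUCKET_STORAGE_PROVIDER; check your deployment for the real ones.
STORAGE_CHECKS = {
    "azure": {
        "key_file": "azure-storage-account-connection-string.key",
        "secret": "azure-storage-creds",
        "how_passed": "secretRef",
    },
    "gcp": {
        "key_file": "gcp-storage.json",
        "secret": "cloud-storage-credentials",
        "how_passed": "volume mount",
    },
}


def storage_checklist(provider: str) -> list[str]:
    """Return the checks to run for the given provider."""
    try:
        check = STORAGE_CHECKS[provider.lower()]
    except KeyError:
        raise ValueError(f"unknown provider: {provider!r}") from None
    return [
        f"Verify that {check['key_file']} holds a valid key.",
        f"Verify that the Kubernetes secret {check['secret']} exists.",
        f"Verify that the secret reaches the services via {check['how_passed']}.",
    ]


if __name__ == "__main__":
    for step in storage_checklist("gcp"):
        print("-", step)
```

The lookup only tells you what to inspect; it does not contact the cluster or validate the keys themselves.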
Database Errors
Reported with HTTP status code 503.
Troubleshooting Guide
For database connection issues, check the following:
- The database is running and accessible via other tools (e.g., psql).
- The IP address and credentials of the database specified during setup are correct.
- The port the database is available on matches the one used in the Dais deployment's values.yaml.
- The firewall and network policies in use allow connections to the database from Kubernetes services.
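The reachability part of these checks can be sketched with a plain TCP connection test. This is a minimal sketch using only the Python standard library: the function name can_reach and the host and port shown are placeholders, not values from a real Dais deployment.

```python
import socket


def can_reach(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


if __name__ == "__main__":
    # Placeholder values: replace with the host and port from values.yaml.
    host, port = "127.0.0.1", 5432
    print(f"{host}:{port} reachable: {can_reach(host, port)}")
```

A successful connection only shows that the port is open from where the script runs; it does not validate the credentials. To exercise the firewall and network policy path, the check would need to run from inside the Kubernetes cluster rather than from a workstation.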
App and Docker Registry Errors
Reported as AppError or DockerRegistryError with HTTP status code 403 or 503. Most often encountered when building persistent-services or models.
Troubleshooting Guide
Depending on the PROJECT_SERVICES_BUILD_CACHE_DOCKER_REGISTRY_AUTH_TYPE set during deployment, you will need to check:
- For GCP, the credentials are set in the gcp-build-cache-registry-credentials Kubernetes secret, based on the gcp-cache-registry-key.json deployment file.
- For 'basic' verification, the credentials are set in:
  - PROJECT_SERVICES_BUILD_CACHE_DOCKER_REGISTRY_USERNAME and PROJECT_SERVICES_BUILD_CACHE_DOCKER_REGISTRY_PASSWORD
  - PROJECT_SERVICES_BASE_DOCKER_ARTIFACTORY_REGISTRY_USERNAME and PROJECT_SERVICES_BASE_DOCKER_ARTIFACTORY_REGISTRY_PASSWORD
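For the 'basic' case, a quick way to spot a missing credential is to list which of the four settings are unset or empty. This is a minimal sketch: it assumes the settings are visible as environment variables in the place where the script runs, which may not match how your deployment passes them, and the helper name missing_basic_auth_vars is hypothetical.

```python
import os

# Variable names are taken from the list above. The group labels are for
# readability only and are not Dais terminology.
BASIC_AUTH_VARS = {
    "build cache registry": (
        "PROJECT_SERVICES_BUILD_CACHE_DOCKER_REGISTRY_USERNAME",
        "PROJECT_SERVICES_BUILD_CACHE_DOCKER_REGISTRY_PASSWORD",
    ),
    "base artifactory registry": (
        "PROJECT_SERVICES_BASE_DOCKER_ARTIFACTORY_REGISTRY_USERNAME",
        "PROJECT_SERVICES_BASE_DOCKER_ARTIFACTORY_REGISTRY_PASSWORD",
    ),
}


def missing_basic_auth_vars(env=None) -> list[str]:
    """Return the names of basic-auth variables that are unset or empty."""
    # ASSUMPTION: the settings are exposed as environment variables.
    env = os.environ if env is None else env
    return [
        name
        for names in BASIC_AUTH_VARS.values()
        for name in names
        if not env.get(name)
    ]


if __name__ == "__main__":
    missing = missing_basic_auth_vars()
    if missing:
        print("Unset or empty:", ", ".join(missing))
    else:
        print("All basic-auth variables are set.")
```

The check confirms only that values are present; whether the registry accepts them still has to be verified against the registry itself.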
Internal Dais Components Errors
Internal Dais component errors typically occur at the Core and App levels rather than the Infrastructure level. Troubleshooting guides that detail how to diagnose and resolve errors at these levels are currently in development.